


DOCUMENT RESUME 



ED 226 065                                        TM 830 112

TITLE           Testing in the Schools: What Does It Mean?
INSTITUTION     Vermont State Dept. of Education, Montpelier. Div. of
                Federal Assistance.
SPONS AGENCY    Department of Education, Washington, DC.
PUB DATE        May 81
NOTE            23p.
PUB TYPE        Guides - Non-Classroom Use (055) -- Reports -
                Descriptive (141)
EDRS PRICE      MF01/PC01 Plus Postage.
DESCRIPTORS     Achievement Tests; Criterion Referenced Tests;
                Diagnostic Tests; Elementary Secondary Education;
                Intelligence Tests; Norm Referenced Tests; *Scores;
                *Standardized Tests; *Testing; *Test Interpretation;
                Test Theory
IDENTIFIERS     Elementary Secondary Education Act Title I



ABSTRACT

Because testing, in many different forms, currently
plays such an important role in education, Elementary Secondary
Education Act, Title I, the Division of Federal Assistance in the
Vermont State Department of Education, prepared this brochure to
present a general introduction to terms and phrases commonly used in
testing and to highlight some of the advantages and disadvantages of
intelligence tests, achievement tests, and diagnostic tests. The
differences between, as well as the advantages and disadvantages of,
norm-referenced and criterion-referenced tests are discussed. The
meanings of five kinds of test scores are presented: raw scores,
grade equivalents, percentiles, stanines, and normal curve
equivalents. While this pamphlet attempts to provide an overview of
testing, it also points out that testing can be a complicated process
that requires a great deal of careful consideration before
conclusions can be drawn. (Author/PN)



***********************************************************************
*    Reproductions supplied by EDRS are the best that can be made     *
*                    from the original document.                      *
***********************************************************************






U.S. DEPARTMENT OF EDUCATION
NATIONAL INSTITUTE OF EDUCATION
EDUCATIONAL RESOURCES INFORMATION CENTER (ERIC)

X This document has been reproduced as received from the person or
  organization originating it.

  Minor changes have been made to improve reproduction quality.

• Points of view or opinions stated in this document do not
  necessarily represent official NIE position or policy.



TESTING
IN THE
SCHOOLS:

What Does It Mean?



"PERMISSION TO REPRODUCE THIS 
MATERIAL HAS BEEN GRANTED BY 



TV JoStVy\ 



Published By 
ESEA Title I 

Division of Federal Assistance 
Vermont Department of Education 
Montpelier, Vermont 05602



TO THE EDUCATIONAL RESOURCES 
INFORMATION CENTER (ERIC) " 



TESTING 

IN THE

SCHOOLS: 



What Does It Mean? 



Published By 
ESEA Title I

Division of Federal Assistance 
Vermont Department of Education 
Montpelier, Vermont 05602 






May 1981



STATE OF VERMONT

Governor
Richard A. Snelling

Commissioner of Education
Robert A. Withey

Deputy Commissioner of Education
Edward J. Fabian

STATE BOARD OF EDUCATION

Allen Martin, Chairman      Essex
Alof Carlson                Proctor
Viola Luginbuhl             South Burlington
Louise Swainbank            St. Johnsbury
A. Shirley Tyler            Brattleboro
Thomas P. Whalen            Arlington
Lynn T. Wood                St. Albans






TABLE OF CONTENTS

THE PURPOSE OF THIS PAMPHLET

WHY DO WE TEST?

WHAT ARE THE MOST COMMON TYPES OF TESTS
AND WHAT DO THEY MEASURE?

    IQ or Intelligence Tests
    Achievement Tests
    Diagnostic Tests

WHAT IS THE DIFFERENCE BETWEEN NORM-REFERENCED AND
CRITERION-REFERENCED TESTS?

WHAT DO TEST SCORES MEAN?

    Raw Scores
    Grade Equivalents
    Percentiles
    Stanines
    Normal Curve Equivalent (NCE)

TESTING IS USEFUL, BUT . . .

WHAT IS THE VERMONT BASIC COMPETENCY PROGRAM?

IN SUMMARY



THE PURPOSE OF THIS PAMPHLET 

"Local students tested above the national
average in reading."

"Thaxter School prepares to administer
diagnostic test battery to students."

"SAT scores decline for the second
straight year."

"Ann scored at the 50th percentile in
reading and the 95th percentile in
mathematics."

"Tom is eight months below grade level
in reading."

"Jon gained 12 NCEs in reading."

We live in a society that is extremely oriented toward testing. Tests
are used to report progress, to compare performance, to determine
advancement, and to judge success or failure. Testing, in one form or
another, appears in almost every phase of life. Testing occurs more
frequently, and probably carries more weight, in the field of education
than in any other field.

Because testing, in many different forms, currently plays such an
important role in education, ESEA Title I, the Division of Federal
Assistance in the Vermont State Department of Education, believed it
was important to prepare this brochure to present a general
introduction to terms and phrases commonly used in testing and to
highlight some of the advantages and disadvantages of certain types of
tests.

If, after reading this pamphlet, you desire additional information or
clarification about testing, please contact either your local school
administrator or the Evaluation Consultant, Division of Federal
Assistance, Vermont Department of Education, Montpelier, Vermont 05602.
Phone: (802) 828-3124.






WHY DO WE TEST?

Webster's Dictionary defines the word "test" as "a set of questions or
exercises for determining one's knowledge or skills; an examination or
trial to determine something's value."

Testing, or determining the level of one's knowledge, has always
existed in American education. In the earliest times, testing was usually
done as part of the daily school routine. Individual students were
frequently asked to recite passages or do arithmetic problems on small
chalkboards at their desks. The judgment of how well the student had
acquired knowledge or mastered the skills being taught was left
entirely in the hands of the classroom teacher. Problems with this
approach were obvious:

• No way existed to tell whether a teacher was being too harsh or too
  easy in judging students.

• Tests were not "objective"; that is, the teacher's judgment played a
  major role and was subject to biases ranging from student behavior
  to family background.

• There was no common standard for comparing the performance of
  one student to other students.

Problems associated with testing and setting objective standards in
education became apparent during World War I, when it was found that
large numbers of young draftees could not read, write, or complete
simple arithmetic problems even though they had completed school. It
became essential to be able to compare one recruit's skills with
another's for placement in training programs. Thus, an effort to develop
tests to compare one person's knowledge or skills with those of a group
of similar people was undertaken on a large scale. Following World War I,
the emphasis on testing continued to grow, and testing for the purpose
of comparing knowledge and skill levels became a formal part of
education and an important tool in improving programs and instruction.

Today, schools test to determine how well their students learn what
the schools think is being taught. This information is used by teachers,
administrators, and educational specialists to help determine how
effective educational programs are for students and what additions or
changes can be made that will most benefit the student.

Schools use a wide variety of tests in an attempt to determine how
well their students are doing. These range from teacher-developed
classroom tests designed to measure the content of a daily lesson to
more sophisticated tests developed by large commercial test developers.

Other than teacher-made tests, the most frequently administered
tests are intelligence (IQ) tests, achievement tests, and diagnostic
tests. The next section of this pamphlet will examine the advantages
and disadvantages of each type of test.




WHAT ARE THE MOST COMMON
TYPES OF TESTS AND WHAT DO
THEY MEASURE?

There are many different types of tests available to measure knowledge
and skills. The most frequently used test in the classroom is the
one that is developed and administered by the teacher to measure
specific information he or she has recently taught. In many respects
these are the most important tests a child will take because they allow a
teacher to monitor educational progress on a daily basis. However,
because teachers must construct and grade so many of these tests,
some questions may be poorly phrased and errors may occur in scoring.
Furthermore, teacher-made tests do not permit comparisons of a
student's performance to other students outside of that particular
classroom. For these reasons, commercial publishers hire professional
test writers to develop more sophisticated standardized tests and
scoring systems.

The most commonly used commercial tests are intelligence tests,
achievement tests, and diagnostic tests. Each of these types of tests
is used for different reasons. The tests, the reasons for using them,
and their advantages and disadvantages are described below.

IQ or Intelligence Tests

"IQ" stands for intelligence quotient. Intelligence tests are designed
to measure a person's potential for learning compared to other people
his or her own age. Intelligence tests are not meant to measure
specific knowledge. Intelligence test scores tend to be fairly constant
from year to year but do vary as a function of how motivated a student is
to do well on the test, or how well he or she is feeling on a particular
day. A child's potential for learning may even change as a result of
changes in his or her environment or the kinds of educational
experiences he or she encounters. IQ tests are not used in ESEA Title I
testing.

There is a great deal of controversy surrounding the issue of
intelligence testing. As a result, IQ scores are used much less
frequently now than they were in the past. Experts still cannot agree on
a definition of intelligence, whether or not there are different kinds of
intelligence, the degree to which intelligence is hereditary, the extent
to which tests actually measure intelligence, the amount of cultural bias
in the tests, the stability of test scores over time, and the weight
which should be given to IQ scores in an overall evaluation of the child.
It is important, therefore, that intelligence scores be interpreted by
people who are aware of the controversies surrounding these tests and
who have other information available about the child.



Remember: intelligence tests, like all other tests, are only one
indicator of a student's intellectual development and should never be
the only criterion for judging a student's abilities or skills.

Advantages

• IQ scores have been fairly accurate in the past for
  predicting student success in academic settings.

• Individual IQ tests can be used as one measure for
  the selection of students for special programs or to
  identify those in need of special assistance. Normally
  an IQ score of 100 is considered "average."
  The further away from this score (either up or down)
  a child scores, the greater the possible need for a
  special program or assistance.

Disadvantages

• Oftentimes intelligence tests do not take into account
  differences in the cultural or economic backgrounds
  of students being tested.

• The results of IQ tests are often misinterpreted. IQ
  scores can and do change. Accurate interpretation
  can be done only by trained experts.

• Accurate IQ tests can only be given to one student
  at a time rather than to large groups. This requires
  much more time for testing and adds to the expense.
  In addition, most intelligence tests can only be
  administered by a specialist trained in giving a
  particular intelligence test.

Intelligence tests are being used less frequently in today's
classrooms than in the past. A more commonly used commercial test today
is the achievement test.

Achievement Tests

Unlike intelligence tests, which were designed to measure a person's
"potential for learning," achievement tests are intended to measure a
person's general skills in specific academic areas such as vocabulary,
reading comprehension, arithmetic computation, spelling, social
studies, or science. These tests are often referred to as "survey tests"
because they do not try to determine everything a student may know
about a subject. Usually, achievement tests are constructed by test
publishers using experts from universities, textbook writers, and
curriculum specialists. These people examine what is being taught at
different grade levels across the country in areas such as reading and
mathematics. Based on this information, these experts develop test
questions to measure generally how well students are doing in each
academic area.

Achievement tests are commonly used in selecting students to
participate in ESEA Title I. This is because student Title I
participation is contingent upon a student's not achieving in basic
skills on a par with his or her peers. These tests are also commonly
used for reporting the gains students show from ESEA Title I assistance.

While achievement tests are popular among educators, they, like all
tests, should be looked upon as only one indicator of how well a
student or educational program is succeeding.

Because achievement tests are general measures of an academic
area, not all the questions on the test will precisely match what is being
taught in your child's school or classroom. These tests are useful for
getting a general picture of how well a group of students or an
educational program is functioning in a specific subject matter area.

Advantages

• Achievement tests allow parents and educators to
  compare how well their students and schools are
  performing in terms of general knowledge in such
  areas as reading, mathematics, science, etc., with
  other students and schools from across the country.

• Achievement tests are usually administered to large
  groups of students and, therefore, are relatively
  inexpensive and less time consuming than
  intelligence tests.

• Achievement tests can be easily administered and
  interpreted by classroom teachers.

Disadvantages

• The questions on achievement tests do not measure
  precisely what is being taught in a classroom or
  school. There may be questions on the test that
  measure information that is not being taught at that
  grade level or in that school. There also may be
  information or skills being taught in a grade or
  school for which there are no questions on the
  achievement test.

• Only content areas which can be measured by
  multiple-choice test items are included in the test.

• Achievement tests cannot be used to pinpoint the
  strengths and weaknesses of individual students in
  various subject matter or skill areas.

If achievement tests provide general information on how well
students or programs are performing, what test can be used to pinpoint
problem areas? The answer is a diagnostic test.

Diagnostic Tests

While the achievement test is intended to measure general knowledge
or skill in a subject area, the diagnostic test is intended to identify
specific problem areas within that subject matter. For example, an
achievement test in arithmetic computation may simply show that the
student is performing poorly in that general area. A diagnostic
arithmetic test might show that the student can do everything but carry
and borrow in addition and subtraction problems.

For the most part, diagnostic tests are the most popular test tool used
by teachers and educational specialists. These tests allow teachers to
identify the specific strengths and weaknesses a student has and then
plan a precise program to overcome these weaknesses. Diagnostic
tests are commonly used for ESEA Title I testing to assist teachers both
in diagnosing problems and in checking for student progress.

Diagnostic tests, unlike achievement tests, are usually designed to
measure only reading or mathematics and not subject areas such as
social studies and science.

Advantages

• Unlike achievement tests, diagnostic tests provide
  the teacher with a detailed picture of a student's
  strengths and weaknesses.

• Results allow the teacher to develop precise
  instructional plans to compensate for weaknesses and to
  build on strengths.

• Usually diagnostic tests can be administered to large
  groups of students, which makes them less expensive
  and time consuming than intelligence tests.

Disadvantages

• Diagnostic tests usually provide information on only
  one subject area at a time, which is usually either
  reading or mathematics.

• Diagnostic tests require more training to administer
  and interpret than achievement tests but less
  training than intelligence tests.

• Diagnostic test information is of little value unless
  someone can take that information and develop a
  specific educational program to overcome student
  weaknesses.

In summary, let us review what we have learned about tests that are
commonly used in education.

• Teacher-made tests, although extremely valuable in measuring
  student progress, are generally less sophisticated and do not
  permit comparisons across grades or schools.

• Intelligence tests are being used less frequently in schools, usually
  require an expert to administer and interpret, are given to one
  student at a time, and measure "potential for learning" rather than
  specific knowledge in a particular content area.

• Achievement tests are the most commonly used commercial test;
  they measure general knowledge levels in several subject matter
  areas; they give an overall picture of how well a student or program
  is doing. Not all of the questions on an achievement test match
  what is being taught in a classroom or school.

• Diagnostic tests are probably the most popular tests among
  teachers; they identify specific student strengths and weaknesses,
  usually in math or reading; the information can be used to develop
  precise educational plans for students to overcome weaknesses.

Finally, tests are only one indication of how well a student or
educational program is functioning, and this information by itself
should not be the sole criterion for judging performance.

Judy W. has a record of scoring poorly on any type of commercially
developed test. In talking with her, we find out that the "pressure"
created by the way these tests are given frightens her and that she finds
she cannot concentrate on the questions. How else can we measure
how well Judy is doing in school? We could look at other test scores to
see if she has always had this problem in taking tests or whether it is a
recent development; we could look at her grades in various subjects;
we could talk with her teachers to get an indication of her ability; we
could examine samples of her daily work; or we could observe her
performance in the classroom periodically. While none of these
alternatives to testing gives us a broad basis for comparing Judy to
other students, they do allow us to make general statements about her
skills and ability.

WHAT IS THE DIFFERENCE 
BETWEEN NORM-REFERENCED 
AND CRITERION-REFERENCED
TESTS? 

We have talked about the different types of tests usually found in
schools. Since achievement tests are the most commonly used tests in
education, and most schools use achievement tests at one time or
another, it is important to understand at least two different ways these
tests are used to make comparisons of students.

We can see how well a student did on an achievement test compared
to similar students from across the country. In test jargon, a test that
uses this kind of comparison (an individual's performance compared
to a group's performance) is called a "norm-referenced test."

We can also compare how well a student did on an achievement test
by matching his or her performance to predetermined criteria. For
example, we might say that we expect all fourth grade students to be
able to pass 9 out of 10 addition problems on a test. If the student
cannot do this, he or she will be given special assistance. In this case,
the student's performance on the test is being compared to a
predetermined expectation or criterion, and, therefore, we call this kind
of achievement test a "criterion-referenced test."
Let us examine these two kinds of comparisons from achievement
tests.

The group comparison or "norm-referenced test" is simply a test
whose questions have been given previously to large numbers of
students from all over the country. When test publishers develop an
achievement test, they ask schools across the nation to give the test to
their students. In return, the schools are not charged for the test.
By giving the test to large numbers of students from urban,
suburban, and rural schools in regions throughout the country, the test
publisher hopes to get a cross-section of student scores that reflect
how students of different skill levels perform on the test. Based on
these scores, the test publisher then has an idea of how an "average,"
"above average," or "below average" student will score on this test. It is
this average or "normal population's" performance on the test against
which your school's students are compared.

The actual comparison is made by taking the number of correct
answers your student received and going to a table in the test booklet
that provides scores of how well the "normal population" did on the
test.
The reason the test publisher gives the test to so many students
initially is to get a better picture of how well the "typical" or "average"
student can be expected to score and to ensure that students from
varied backgrounds have been included in the "normal population."
For example, it would not be fair to test students from poor rural areas
on an achievement test which had been tried out on students from rich,
suburban areas. We can assume that wealthier suburban schools and
communities have more resources available, both in and out of school,
that may affect how well their students do on tests. Rural students often
do not have these same resources available. Therefore, it is not fair to
compare how well they perform on tests "normed" on just students
from wealthy suburban areas.

On a "norm-referenced test," then, comparisons can be made
between your child's performance and that of other students of similar
age, grade level, or background from across the country. Scores on a
"norm-referenced test" may also be used to determine how well a
school's or grade's performance compares with similar schools or
grades across the country.

Advantages

• Norm-referenced tests provide a means of comparing
  the performance of individual students or
  educational programs to other similar students or
  programs from across the country.

• Results from the publisher's norming group can be
  used as a benchmark for determining how much
  learning has taken place and for identifying areas of
  general weakness in curriculum.

Disadvantages

• Norm-referenced tests always judge a student's
  performance relative to the performance of the students
  in the "normal population." This norm population
  may not be made up of similar students, and so
  comparisons would not be fair.

• The results of "norm-referenced tests" are often
  over-interpreted and given more weight than other
  indicators of learning. While it is fair to use
  norm-referenced test results for general comparisons, you
  need to remember that the test does not perfectly match
  what is being taught in your local schools.

• Data collected from the "norming group" quickly
  becomes out-of-date because of changes in
  curriculum as well as in society as a whole. For example, it
  probably is misleading to compare the performance
  in reading of fourth graders in 1979 to how fourth
  graders in the "norm population" did in 1969.
  Teaching methods changed, textbooks were improved,
  students were exposed to vastly different
  experiences, social or parental pressures may have
  changed: all these factors probably contributed to
  making the 1969 norms outdated.
An alternative to comparing a student against how another group of
similar students did on a test is to compare his or her performance
against a predetermined criterion or standard. This type of test is called
a "criterion-referenced test."

A criterion-referenced test compares the student to a set of criteria
rather than to the performance of other students. Questions are usually
arranged to measure skills in some type of sequence, from the simplest
skills to the more complicated. If a student can pass all or most of the
items, we say he or she has "mastered" the skills. If the student begins to
fail questions, then we say that mastery of that skill has not been
attained, and that is where instruction begins. On a criterion-referenced
test there is usually no comparison with how other students did on the
same questions. We are interested only in what each individual student
can do.

On a math criterion-referenced achievement test, we may expect a
fourth grader to be able to do 4 of the first 6 items correctly and a fifth
grader to do 10 of the first 14 items. If a student in either grade fails to
obtain the necessary number of correct items (the "criterion"), then a
special program of instruction can be provided for that student. This
allows students to learn at their own pace and to be operating at different
levels in different subject areas without specific references to grade
levels.
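
The fourth and fifth grade example can be written out as a short Python sketch. This is only an illustration: the two criteria come from the example above, while the names CRITERIA and meets_criterion are made up for this sketch and do not come from any real test.

```python
# Criteria from the pamphlet's example: grade -> (items required, items counted).
# The names CRITERIA and meets_criterion are illustrative assumptions.
CRITERIA = {
    4: (4, 6),    # fourth grade: 4 of the first 6 items
    5: (10, 14),  # fifth grade: 10 of the first 14 items
}


def meets_criterion(grade: int, items_correct: int) -> bool:
    """Return True if the student reached the criterion for the grade."""
    required, counted = CRITERIA[grade]
    if not 0 <= items_correct <= counted:
        raise ValueError("items_correct must be between 0 and the items counted")
    return items_correct >= required


# A fourth grader with 3 correct has not met the criterion, so special
# instruction would begin; a fifth grader with 10 correct has met it.
print(meets_criterion(4, 3))
print(meets_criterion(5, 10))
```

Note that the check never looks at how other students scored; the only comparison is between one student and the fixed criterion.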

Advantages

• Student achievement is judged on how well that
  student performed a desired skill rather than by
  comparing his or her performance to another group of
  students functioning at a different grade level.

• Teachers can obtain information about how well
  individual students have mastered specific skills and
  use that information to develop individual programs
  of study for those students.

Disadvantages

• In most cases, there is no way of knowing how a
  student's score on a criterion-referenced test
  compares with a national average.

• Test results usually cannot be summarized in a
  simple score.

In summary, there are two basic kinds of achievement tests:
norm-referenced tests and criterion-referenced tests. The difference
between these tests is how they evaluate a student's performance.

A norm-referenced test compares a student's test performance to
how well a group or "normal population" did on the same test. This type
of test allows a school to see how well its students or programs are
doing compared with similar students or programs across the country.
If the "norm population" is not similar to your students, then the
comparisons will be misleading. The results of ESEA Title I assistance are
usually reported on the basis of norm-referenced tests.

A criterion-referenced test compares the student to a set of criteria
or expectations in terms of skills to be mastered. In many ways,
criterion-referenced tests, like diagnostic tests, allow the teacher to
see which skills have been mastered and which have not and to plan
specific educational programs for the student. Many ESEA Title I
teachers use criterion-referenced tests to monitor student progress.
The results of some criterion-referenced tests, such as the PRI-DM, can
be converted, by the test publisher, to a report on how well the child did
in comparison to other students across the country.







WHAT DO TEST SCORES MEAN?

We have discussed the types of tests used in schools and looked
specifically at achievement tests, but tests are of no value unless we
can understand what the scores produced really mean. We will look at
five kinds of test scores: raw scores, grade equivalents, percentiles,
stanines, and NCEs. These scores are usually found only on
norm-referenced tests since they represent different ways of comparing one
student's performance with that of a group of peers.

Raw Scores 

A raw score is simply the number of questions the student answered
correctly. Because the number of questions varies between tests, the
raw score itself does not have any value in making comparisons. For
example, if a fifth grade student gets 21 out of 50 questions correct on a
reading test and then gets 12 questions out of 25 correct on a math test,
what do we know about the student's performance in reading and
math? The answer is nothing, since we don't know how hard the
questions are for a fifth grader. We use the raw score to go to the test
publisher's tables and convert to scores that let us compare the child's
performance to that of other fifth graders. One of those scores is the
grade equivalent.
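
As a small worked illustration of why raw scores cannot be compared directly, the sketch below turns the two raw scores from the example into percent correct. The function name percent_correct is an assumption made for this illustration. Even these percentages say nothing about how hard the questions were for a fifth grader, which is why the publisher's tables are still needed.

```python
def percent_correct(raw_score: int, total_items: int) -> float:
    """Convert a raw score to the percentage of items answered correctly."""
    if total_items <= 0 or not 0 <= raw_score <= total_items:
        raise ValueError("raw_score must be between 0 and total_items")
    return 100 * raw_score / total_items


reading_pct = percent_correct(21, 50)  # 21 of 50 reading items
math_pct = percent_correct(12, 25)     # 12 of 25 math items

# The math percentage is higher, but without norms we still cannot say
# whether this student is stronger in math than in reading.
print(reading_pct, math_pct)
```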

Grade Equivalents 

Grade equivalent scores are based on a division of the school year
into nine months:

End of
Sept.   Oct.   Nov.   Dec.   Jan.   Feb.   Mar.   Apr.   May   June
  0      1      2      3      4      5      6      7      8     9

If a student is functioning at grade level, that student is demonstrating
the same level of skill or knowledge as other students at that grade
level. In other words, he or she is getting the same number of items
correct on a test as the average student at the same grade level. For
example, a third grade student who is tested in reading in October and
who is said to be scoring at the 3.1 grade level (third grade, first
month) is at grade level. A student scoring 3.0 (third grade, zero months)
is slightly below grade level; 3.2 (third grade, second month) is slightly
above grade level. If a third grade student scores 10.3 (tenth grade,
third month) on a reading achievement test, does this mean he or she
should be in the tenth grade? No, it simply means that the student is
achieving well and probably could do more challenging work, though not
necessarily at the tenth grade level. The third grade reading test in this
case was never given to tenth graders. The test publisher arrived at the
tenth grade score through a statistical formula rather than actually
giving the test to tenth graders.
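
The grade equivalent arithmetic in this example can be sketched in Python. The month numbering follows the scale described above (September is 0, October is 1, and so on through June at 9); the names MONTH_INDEX and distance_from_grade_level are assumptions made for this sketch, and the score is passed as text such as "3.1" so that the grade and month parts can be read separately.

```python
# Month numbers on the grade equivalent scale (September = 0 ... June = 9).
MONTH_INDEX = {
    "Sep": 0, "Oct": 1, "Nov": 2, "Dec": 3, "Jan": 4,
    "Feb": 5, "Mar": 6, "Apr": 7, "May": 8, "Jun": 9,
}


def distance_from_grade_level(score: str, grade: int, test_month: str) -> int:
    """Return how far a grade equivalent is from grade level, in scale
    months (tenths of a school year). Negative means below grade level."""
    score_grade, score_month = score.split(".")
    earned = int(score_grade) * 10 + int(score_month)
    expected = grade * 10 + MONTH_INDEX[test_month]
    return earned - expected


# Third grader tested in October: 3.1 is exactly at grade level.
print(distance_from_grade_level("3.1", 3, "Oct"))
print(distance_from_grade_level("3.0", 3, "Oct"))
print(distance_from_grade_level("3.2", 3, "Oct"))
```

A large positive result, such as the one a score of 10.3 would give, shows only that the student did very well on the third grade test; it does not place the student in the tenth grade.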



The major disadvantages of the grade equivalent score are that it
cannot be added and averaged accurately and that it assumes that the
same amount of learning occurs in each month of the school year. We
know that is not true. Therefore, changes in grade equivalent scores
are not necessarily accurate indicators of student progress.

If grade equivalents are not good indicators of progress, what about
percentiles?

Percentiles

A percentile is another type of score that is derived from the raw
score (number of correct responses). The percentile is a scale from 1 to
99, with the 50th percentile being considered average.

If a student receives a raw score of 20 on his or her reading
achievement test and we look this up in the test publisher's table, we
find it converts to a percentile score of 60. What this means is that
approximately 40% of the students in the "norm population" scored higher
than this student in reading and approximately 60% of the students in
the norm population scored lower in reading.
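
The idea behind the publisher's table can be sketched in Python. The ten norm group scores below are invented for illustration only, and the function name percentile_rank is an assumption; real publishers build their tables from very large samples and may use more refined formulas. The sketch simply counts what percentage of the norm group scored lower and keeps the result on the 1 to 99 scale.

```python
def percentile_rank(raw_score: int, norm_scores: list[int]) -> int:
    """Percentage of the norm group scoring below raw_score, kept on the
    1-to-99 scale used for percentiles."""
    below = sum(1 for score in norm_scores if score < raw_score)
    percent_below = round(100 * below / len(norm_scores))
    return max(1, min(99, percent_below))


# Hypothetical norm group of ten students (invented numbers).
norm_group = [8, 11, 13, 15, 17, 19, 22, 24, 26, 29]

# Six of the ten scored below 20 and four scored above it.
print(percentile_rank(20, norm_group))
```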

Obviously, the higher above the 50th percentile students score, the
better they are doing, and the further below the 50th percentile they
score, the greater the need is for special assistance.

Although there are problems with adding and averaging percentiles,
these problems are not as great as they are with grade equivalents.
Percentiles provide a much more accurate description of student
progress than grade equivalents.

Another commonly used test score is called the stanine 

A stanine is another indicator of a student's rank relative to the norm
population and is also derived from the test publisher's table by using
the raw score.

Stanines are a scale from 1 to 9. Stanine scores of 1, 2, or 3 are
considered to be below average scores; scores of 4, 5, or 6 are average;
and stanine scores of 7, 8, or 9 are above average scores.
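The sketch below shows one way a percentile could be placed into a stanine and then into the three bands just described. The cut points assume the conventional stanine split of 4, 7, 12, 17, 20, 17, 12, 7, and 4 percent of the norm population; that split and the function names are assumptions for illustration, and a publisher's own table may draw the boundaries slightly differently.

```python
from bisect import bisect_left

# Highest percentile falling in stanines 1..8 under the conventional
# 4-7-12-17-20-17-12-7-4 percent split (assumed, not taken from the pamphlet).
STANINE_CUTS = [4, 11, 23, 40, 60, 77, 89, 96]

def stanine_from_percentile(pct):
    """Map a percentile (1..99) to a stanine (1..9)."""
    return bisect_left(STANINE_CUTS, pct) + 1

def band(stanine):
    """Describe a stanine using the pamphlet's three bands."""
    if stanine <= 3:
        return "below average"
    if stanine <= 6:
        return "average"
    return "above average"

s = stanine_from_percentile(60)
print(s, band(s))  # 5 average
```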

While the stanine, unlike the grade equivalent and percentile scores,
can be added and averaged accurately, its disadvantage is that it is not
a very precise indicator of student progress. For example, a student
may get 27 questions correct the first time he/she takes the test. This
number of correct questions may just barely be enough to get a stanine
score of 3. At the end of the year the student takes the same test and
this time gets 32 items correct. This number of correct questions is only
one away from a stanine score of 4 but it is still one short, so the student
still has a stanine score of 3. The student had a stanine score of 3 when
he or she started and still has a stanine score of 3 at the end of the year.
Does this mean no learning took place? Obviously not, because we
saw from the example that the student answered more questions
correctly the second time than he or she did the first time. What it means is
simply that the stanine score is not exact enough to accurately reflect
student progress.
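The example above can be sketched with a made-up raw-score table. Only the two boundaries implied by the example (27 is the lowest raw score earning a stanine of 3, and 33 the lowest earning a 4) come from the text; every other cut point, and the function name, is hypothetical.

```python
from bisect import bisect_right

# Lowest raw score earning stanines 2..9. Only 27 and 33 follow from the
# pamphlet's example; the remaining values are invented for illustration.
RAW_FLOORS = [20, 27, 33, 39, 45, 51, 56, 60]

def stanine_from_raw(raw):
    """Map a raw score to a stanine using the hypothetical table above."""
    return bisect_right(RAW_FLOORS, raw) + 1

first, second = 27, 32
print(stanine_from_raw(first), stanine_from_raw(second))  # 3 3
print(second - first)  # 5 more items correct, yet the stanine is unchanged
```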






NCE

A score which is technically very similar to a stanine, but looks like a
percentile, is the NCE, or normal curve equivalent. While stanines
range from 1 to 9, NCEs range from 1 to 99. If one translated stanine
scores into NCEs, scores of 1 to 33 NCEs would be considered below
average, scores of 34 to 66 would be average, and scores of 67 to 99
would be above average. Because there are 99 points on the scale
instead of 9, NCEs provide a much more precise measure of where a
student is, and how much he or she progresses relative to other
students. Like stanines, NCEs can be averaged to provide an accurate
picture of group performance. In fact it was for the purpose of
summarizing nationwide ESEA Title I achievement data that NCEs were
initially developed. NCEs are not often used to report test results of
individual students, but are used to report the ESEA Title I results that a
school district has achieved.
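The pamphlet does not give the conversion itself. A common way to express it, offered here only as a sketch, is to find the normal-curve position (z) of a percentile and then rescale so the mean is 50 and the standard deviation is about 21.06, the value that makes NCEs of 1 and 99 line up with percentiles 1 and 99. The function name is hypothetical, and publishers normally supply a lookup table rather than a formula.

```python
from statistics import NormalDist

NCE_SD = 21.06  # assumed scale factor; makes NCE 1 and 99 match percentiles 1 and 99

def nce_from_percentile(pct):
    """Convert a percentile (1..99) to a normal curve equivalent."""
    z = NormalDist().inv_cdf(pct / 100)  # position on the standard normal curve
    return 50 + NCE_SD * z

for pct in (1, 25, 50, 75, 99):
    print(pct, round(nce_from_percentile(pct), 1))
```

Run as written, the loop shows the two scales agreeing at 1, 50, and 99 and drifting apart in between, which is why an NCE should not be read as if it were a percentile.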

The following excerpts on NCEs are taken from Technical Paper No.
2 by G. Kasten Tallmadge entitled Interpreting NCEs - ESEA Title I
Evaluating and Reporting System, published in October 1976 by the
Office of Education.

"NCEs are like percentiles. Both an NCE of 50 and a percentile of 50
are exactly average. While NCEs do not match percentiles at other
points (except for 1 and 99), the analogy is quite useful when trying to
describe achievement gains measured in NCEs. While it is not strictly
correct to talk about NCE gains as if they were percentile gains, it will
probably facilitate communication and enhance understanding to do
so. This is particularly true since most people tend to think of
percentiles as if they were an equal-interval scale and would be somewhat
confused to learn that a gain from percentile 5 to percentile 10 is almost
exactly twice as big as a gain from percentile 15 to percentile 20.

"An NCE of 50 is at grade level. Regardless of the time of year at which
testing is done and the grade level tested, a properly derived NCE score
of 50 will always be the national average for that grade level and month.
Being average means being exactly at grade level. NCEs below 50
signal below-average achievement levels or below-grade-level
performance. An NCE of 30 is exactly the same distance below grade level
at every grade, while being 'a year below grade level' has a different
meaning at each grade. Finally, an NCE of 30 is always exactly twice as
far below grade level as an NCE of 40 while being 'two years below
grade level' is never twice as much as being one year below grade level
(believe it or not).

"An NCE gain of zero means that the Title I project produced no gain.
A zero NCE gain does not mean that the student or group of students
learned nothing between pretest and posttest. They almost certainly
answered more items correctly at the end of the instructional period
than at the beginning. The zero NCE gain simply means that the
amount of learning was precisely what would have been expected had
there been no Title I project - in other words it means that the Title I
project added exactly nothing to the regular school program.

"All NCE gains greater than zero are good. Whenever the evaluation
shows an NCE gain greater than zero, it means that the Title I pupils
profited from participating in the project. In general, the larger the
NCE gain, the more effective the project. It is not possible, however, to
designate any specific NCE gain as the criterion for exemplary or
outstanding projects. A cost-effectiveness criterion seems more
appropriate. Assuming that the same number of dollars were spent, for
example, a 4-NCE gain produced in a treatment group of 200 pupils
must be considered as good as an 8-NCE gain produced in a
treatment group of 100 pupils."

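Two of the excerpt's numerical claims can be checked with a short sketch. It assumes the usual conversion of a percentile to an NCE (mean 50, standard deviation about 21.06 on the normal curve), which the excerpt itself does not spell out, and the function name is hypothetical.

```python
from statistics import NormalDist

def nce_from_percentile(pct):
    """Assumed conversion: normal-curve position rescaled to mean 50, SD 21.06."""
    return 50 + 21.06 * NormalDist().inv_cdf(pct / 100)

# Equal-looking percentile gains are unequal once placed on the NCE scale.
gain_low = nce_from_percentile(10) - nce_from_percentile(5)
gain_mid = nce_from_percentile(20) - nce_from_percentile(15)
print(round(gain_low, 1), round(gain_mid, 1), round(gain_low / gain_mid, 2))

# Cost-effectiveness comparison: total NCE points gained across all pupils.
print(4 * 200, 8 * 100)  # 800 800
```

Under these assumptions the first gain comes out a little under twice the second, consistent with the excerpt's "almost exactly twice," and the two projects yield the same total gain for the same dollars.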
In summary, we now know that raw scores, or the number of items a
student gets correct on the test, have little value for comparison
purposes, but are the key to getting other test scores from the publisher's
tables. Grade equivalents are not accurate for describing student
progress because they cannot be added or averaged accurately and they
improperly assume that equal learning takes place in each month of
school. Stanines can be averaged accurately but are not precise enough
indicators of student progress. Generally, the best test score for
showing student growth or progress is the percentile or the NCE.

Finally, we should emphasize that caution should be used in
interpreting test scores. If you, as a parent, are unsure about the meaning of
your child's test scores, seek advice from your school. Remember, test
scores are only one indication of a student's performance. We should
not discount other factors that serve as evidence of student
performance such as subject matter grades, examples of the student's daily
work, teacher appraisal, or observation of the student performing in
class.

TESTING IS USEFUL, BUT: 

Testing is a very valuable tool for educators, parents, and the
community when it comes to making decisions about students and
schools. When properly used, test information can help
teachers create educational plans designed to overcome areas of
academic weakness or to build on specific strengths. Test information can
also help determine
where resources should be applied to meet educational needs.

Obviously, test information can also be abused, and to prevent this
we offer the following cautions:






• Testing is only one indication of how well a student or school
  program is performing. Other information should be included along
  with test information before judgments are made about students and
  programs.
      For example, information about the attitudes and behavior of
      the students in school is probably as important as how well
      they do on achievement tests. Information about the resources
      available to provide the instruction is also important.

• If achievement tests do not match the material taught to the students,
  then drawing conclusions from such tests can be misleading, and
  the results will be of little or no value.
      For example, if your school uses a traditional mathematics
      curriculum and the new achievement test you have selected
      tests modern math concepts, the test will not accurately
      represent how your students are doing in math. It is extremely
      important that the content of the achievement test matches as
      closely as possible what is being taught.

• Some students do not perform well on tests; for them, a test is not an
  accurate reflection of achievement.
      For example, some students become so apprehensive when
      it comes time to take a test that they "freeze" or "go blank."
      There is no doubt that testing creates pressure on students. For
      students who cannot take tests, we must use other indicators
      such as teacher grades, observation of classroom
      performance, and examples of work. Efforts should also be made to
      reduce test anxiety for students by telling them the purpose
      and use of the tests they are taking and providing them with
      more experience in taking tests.

• Do not over-interpret test results. Be aware of the limitations of the
  test you are using and the test scores that are being reported.
      For example, "Our fourth grade scored this year at the 48th
      percentile in reading but every other year they have been at the
      52nd percentile. Something must be wrong." To begin with, the
      48th percentile is not significantly below average in terms of
      performance and, for that matter, the 52nd percentile is not that
      high above average. Do not generalize about the quality of an
      entire school based on how one grade scored on one
      achievement test.

• A test must be properly administered in order to obtain accurate
  results.
      For example, giving an achievement test just after students
      have returned from vacation will probably not give you the best
      results. Also, testing students for long periods of time during
      the day can lead to poorer scores because students get tired.
      Teachers should be familiar with directions for giving the
      test. If a test is to be given to students for 30 minutes, the
      teacher should not extend the time to 45 minutes because she
      knows if the students had more time they would do better. The
      norm population that originally took the test had only 30
      minutes. If valid comparisons are to be made, that is all the time
      that can be allowed. Following directions on a test is essential.

• Teachers should not teach the specific questions on a test to
  students. Naturally, they may teach the subject matter or content that
  those questions are intended to measure.

Since the State of Vermont has introduced a basic competency
program that allows school districts the option of determining what
methods will be used to assess student skill levels (locally developed
tests, norm-referenced tests, criterion-referenced tests, etc.), it seems
important to provide a brief explanation of that program.

WHAT IS THE VERMONT BASIC 
COMPETENCY PROGRAM? 

All of the schools in the State of Vermont are probably doing some
type of testing similar to what has been described in this pamphlet. In
addition, all are involved in the Vermont Basic Competency Program.
The purpose of this program is to ensure that students graduating from
Vermont schools have obtained minimal or basic mastery of skills in
the areas of reading, writing, listening, speaking, computation, and
reasoning.

The Basic Competency Program allows the use of a flexible testing
system. The State of Vermont has already stated what the basic skills
should be in each area, but it leaves the measurement of those skills up
to each community. Some communities wish to assess these skills via
norm-referenced or criterion-referenced tests available from
commercial publishers. Most communities, however, have chosen to develop
their own test items. While the Vermont Basic Competency Program
establishes the minimum acceptable level of skill for students and
schools in reading, language arts, arithmetic, and reasoning that must
be demonstrated before a diploma or advancement is awarded, the
methods for measuring these skill areas are left up to the schools.
Schools may elect to use a commercially developed test or to develop
tests of their own or to combine both approaches.

While the testing process used by each school district participating
in the Vermont Basic Competency Program is different and therefore
does not permit comparisons to be made among school districts, it
does provide the district with specific information about how well its
students and schools are performing in the basic skill areas identified
by the state.



IN SUMMARY

We have examined the kinds of tests that are most frequently used in
schools and what various types of test scores mean. What we can
conclude is that testing has been, and probably will continue to be, a
very integral part of the educational process. It is, therefore, important
for parents and teachers to become more familiar with testing and the
issues associated with the use of tests.

While this pamphlet has attempted to provide an overview on testing,
it has also pointed out that testing can be a complicated process that
requires a great deal of careful consideration before conclusions can
be drawn.

Tests are a tool that can help parents and teachers improve
educational programs for students, but they are only one tool. Unless they
are used wisely, they can easily distort the overall picture of a child's
educational development.



